skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Christin, Nicolas"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
  2. null (Ed.)
    Illicit website owners frequently rely on traffic distribution systems (TDSs) operated by less-than-scrupulous advertising networks to acquire user traffic. While researchers have described a number of case studies on various TDSs or the businesses they serve, we still lack an understanding of how users are differentiated in these ecosystems, how different illicit activities frequently leverage the same advertisement networks and, subsequently, the same malicious advertisers. We design ODIN (Observatory of Dynamic Illicit ad Networks), the first system to study cloaking, user differentiation and business integration at the same time in four different types of traffic sources: typosquatting, copyright-infringing movie streaming, ad-based URL shortening, and illicit online pharmacy websites. ODIN performed 874,494 scrapes over two months (June 19, 2019–August 24, 2019), posing as six different types of users (e.g., mobile, desktop, and crawler) and accumulating over 2TB of data. We observed 81% more malicious pages compared to using only the best performing crawl profile by itself. Three of the traffic sources we study redirect users to the same traffic broker domain names up to 44% of the time and all of them often expose users to the same malicious advertisers. Our experiments show that novel cloaking techniques could decrease by half the number of malicious pages observed. Worryingly, popular blacklists do not just suffer from the lack of coverage and delayed detection, but miss the vast majority of malicious pages targeting mobile users. We use these findings to design a classifier, which can make precise predictions about the likelihood of a user being redirected to a malicious advertiser. 
    more » « less
  3. null (Ed.)
  4. null (Ed.)
    Fringe groups and organizations have a long history of using euphemisms---ordinary-sounding words with a secret meaning---to conceal what they are discussing. Nowadays, one common use of euphemisms is to evade content moderation policies enforced by social media platforms. Existing tools for enforcing policy automatically rely on keyword searches for words on a ``ban list'', but these are notoriously imprecise: even when limited to swearwords, they can still cause embarrassing false positives. When a commonly used ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban list is hopeless: consider ``pot'' (storage container or marijuana?) or ``heater'' (household appliance or firearm?). The current generation of social media companies instead hire staff to check posts manually, but this is expensive, inhumane, and not much more effective. It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is, and therefore whether the message violates policy. Also, when a euphemism is banned, the group that used it need only invent another one, leaving moderators one step behind. This paper will demonstrate unsupervised algorithms that, by analyzing words in their sentence-level context, can both detect words being used euphemistically, and identify the secret meaning of each word. Compared to the existing state of the art, which uses context-free word embeddings, our algorithm for detecting euphemisms achieves 30--400\% higher detection accuracies of unlabeled euphemisms in a text corpus. Our algorithm for revealing euphemistic meanings of words is the first of its kind, as far as we are aware. In the arms race between content moderators and policy evaders, our algorithms may help shift the balance in the direction of the moderators. 
    more » « less
  5. Abstract Monero is a privacy-centric cryptocurrency that allows users to obscure their transactions by including chaff coins, called “mixins,” along with the actual coins they spend. In this paper, we empirically evaluate two weaknesses in Monero’s mixin sampling strategy. First, about 62% of transaction inputs with one or more mixins are vulnerable to “chain-reaction” analysis - that is, the real input can be deduced by elimination. Second, Monero mixins are sampled in such a way that they can be easily distinguished from the real coins by their age distribution; in short, the real input is usually the “newest” input. We estimate that this heuristic can be used to guess the real input with 80% accuracy over all transactions with 1 or more mixins. Next, we turn to the Monero ecosystem and study the importance of mining pools and the former anonymous marketplace AlphaBay on the transaction volume. We find that after removing mining pool activity, there remains a large amount of potentially privacy-sensitive transactions that are affected by these weaknesses. We propose and evaluate two countermeasures that can improve the privacy of future transactions. 
    more » « less